Description

Field of the Invention 

[0001] The present invention relates to dynamic facial
feature sensing, and more particularly, to a vision-based 
motion capture system that allows real-time finding, 
tracking and classification of facial features for input into 
a graphics engine that animates an avatar. 

Background of the Invention 

[0002] Virtual spaces filled with avatars are an attrac- 
tive way to allow for the experience of a shared environ- 
ment. However, existing shared environments generally 
lack facial feature sensing of sufficient quality to allow 
for the incarnation of a user, i.e., the endowment of an 
avatar with the likeness, expressions or gestures of the 
user. Quality facial feature sensing is a significant ad- 
vantage because facial gestures are a primordial means 
of communications. Thus, the incarnation of a user aug- 
ments the attractiveness of virtual spaces. 
[0003] Existing methods of facial feature sensing typically
use markers that are glued to a person's face. The use of
markers for facial motion capture is cumbersome and has
generally restricted the use of facial motion capture to
high-cost applications such as movie production.
[0004] Accordingly, there exists a significant need for a
vision-based motion capture system that implements
convenient and efficient facial feature sensing. The
present invention satisfies this need.
[0005] In Wiskott L et al: "Face Recognition by elastic 
bunch graph matching" IEEE TRANSACTIONS ON 
PATTERN ANALYSIS AND MACHINE INTELLI- 
GENCE, vol. 19, no. 7, July 1997, pages 775-779, IEEE 
Comput. Soc. Press, USA, a system for recognising hu- 
man faces from single images out of a large database 
containing one image per person is described, where 
faces are represented by labelled graphs based on a 
Gabor wavelet transform. Image graphs of new faces 
are extracted by an elastic graph matching process and 
are compared by a simple similarity function. 
[0006] Maurer T et al: "Tracking and learning graphs 
and pose on image sequences of faces" PROCEED- 
INGS OF THE SECOND INTERNATIONAL CONFER- 
ENCE ON AUTOMATIC FACE AND GESTURE REC- 
OGNITION, KILLINGTON, VT, USA, 14-16 OCT.1996, 
pages 176-181, IEEE Comput. Soc. Press, USA, dem- 
onstrates a system capable of tracking, in real world im- 
age sequences, landmarks such as eyes, mouth, or chin 
on a face. The system tracks the face with or without 
prior knowledge of faces and the tracking results are 
used to estimate the pose of a face. Gabor filters are 
used as visual features. 

[0007] European Patent Application EP-A-0,807,902
(Cyberclass Limited, 19 November 1997) describes a
method and apparatus for generating moving characters
where an avatar is generated by combining a 3D
representation of the structure of the character which
changes in real time, with a 3D surface representation
of the character mapped onto the structure representation,
and a 2D representation of frequently changing
portions of the surface.

[0008] Recursive tracking of image points using labelled
graph matching is described in Chandrashekhar et al:
"Recursive Tracking of Image Points using Labelled Graph
Matching", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE
ON SYSTEMS, MAN AND CYBERNETICS, 1991.

Summary of the Invention 

[0009] The present invention as set out in claims 1
and 21 is embodied in an apparatus, and related method,
for sensing a person's facial movements, features
or characteristics. The results of the facial sensing may
be used to animate an avatar image. The avatar appa- 
ratus uses an image processing technique based on 
model graphs and bunch graphs that efficiently repre- 
sent image features as jets composed of wavelet trans- 
forms at landmarks on a facial image corresponding to 
readily identifiable features. The sensing system allows 
tracking of a person's natural characteristics without any 
unnatural elements to interfere with the person's natural 
characteristics. 

[0010] The feature sensing process operates on a sequence
of image frames, transforming each image frame
using a wavelet transformation to generate a transformed
image frame. Node locations associated with
wavelet jets of a model graph are initialized on the
transformed image frame by moving the model graph across
the transformed image frame and placing the model
graph at a location in the transformed image frame of
maximum jet similarity between the wavelet jets at the
node locations and the transformed image frame. The
location of one or more node locations of the model
graph is tracked between image frames. A tracked node
is reinitialized if the node's position deviates beyond a
predetermined position constraint between image
frames.

[0011] In one embodiment of the invention, the facial
feature finding may be based on elastic bunch graph
matching for individualizing a head model. Also, the
model graph for facial image analysis may include a plurality
of location nodes (e.g., 18) associated with distinguishing
features on a human face.
[0012] Other features and advantages of the present
invention should be apparent from the following descrip- 
tion of the preferred embodiments, taken in conjunction 
with the accompanying drawings, which illustrate, by 
way of example, the principles of the invention. 

Brief Description of the Drawings

[0013] 

FIG. 1 is a block diagram of an avatar animation 
system and process, according to the invention. 
FIG. 2 is a block diagram of a facial feature sensing
apparatus and process, according to the invention, 
for the avatar animation system and process of FIG. 
1. 

FIG. 3 is a block diagram of a video image processor
for implementing the facial feature sensing
apparatus of FIG. 2.

FIG. 4 is a flow diagram, with accompanying pho- 
tographs, for illustrating a landmark finding tech- 
nique of the facial feature sensing apparatus and 
system of FIG. 2. 

FIG. 5 is a series of images showing processing of
a facial image using Gabor wavelets, according to 
the invention. 

FIG. 6 is a series of graphs showing the construc- 
tion of a jet, image graph, and bunch graph using 
the wavelet processing technique of FIG. 5, accord- 
ing to the invention. 

FIG. 7 is a diagram of a model graph, according to
the invention, for processing facial images.
FIG. 8 includes two diagrams showing the use of
wavelet processing to locate facial features.
FIG. 9 is a flow diagram showing a tracking tech- 
nique for tracking landmarks found by the landmark 
finding technique of FIG. 4. 
FIG. 10 is a diagram of a Gaussian image pyramid 
technique for illustrating landmark tracking in one 
dimension. 

FIG. 11 is a series of two facial images, with accom- 
panying graphs of pose angle versus frame number, 
showing tracking of facial features over a sequence
of 50 image frames. 

FIG. 12 is a flow diagram, with accompanying pho- 
tographs, for illustrating a pose estimation tech- 
nique of the facial feature sensing apparatus and 
system of FIG. 2. 

FIG. 13 is a schematic diagram of a face with ex- 
tracted eye and mouth regions, for illustrating a 
coarse-to-fine landmark finding technique.
FIG. 14 is a series of photographs showing the extraction of
profile and facial features using the elastic bunch 
graph technique of FIG. 6. 
FIG. 15 is a flow diagram showing the generation
of a tagged personalized bunch graph along with a
corresponding gallery of image patches that encompasses
a variety of a person's expressions for
avatar animation, in accordance with the invention.
FIG. 16 is a flow diagram showing a technique for
animating an avatar using image patches that are
transmitted to a remote site, and that are selected
at the remote site based on transmitted tags based
on facial sensing associated with a person's current
facial expressions.

FIG. 17 is a flow diagram showing rendering of a
three-dimensional head image generated, based
on facial feature positions and tags, using volume
morphing integrated with dynamic texture generation.

FIG. 18 is a block diagram of an avatar animation 
system, according to the invention, that includes au- 
dio analysis for animating an avatar. 

Detailed Description of the Preferred Embodiments 

[0014] The present invention is embodied in an apparatus,
and related method, for sensing a person's facial
movements, features and characteristics and the like to
generate and animate an avatar image based on the facial
sensing. The avatar apparatus uses an image
processing technique based on model graphs and
bunch graphs that efficiently represent image features
as jets. The jets are composed of wavelet transforms
processed at node or landmark locations on an image
corresponding to readily identifiable features. The
nodes are acquired and tracked to animate an avatar
image in accordance with the person's facial movements.
Also, the facial sensing may use jet similarity to
determine the person's facial features and characteristics,
thus allowing tracking of a person's natural characteristics
without any unnatural elements that may interfere
with the person's natural characteristics.

[0015] As shown in FIG. 1, the avatar animation system
10 of the invention includes an imaging system 12,
a facial sensing process 14, a data communication network
16, a facial animation process 18, and an avatar
display 20. The imaging system acquires and digitizes
a live video image signal of a person, thus generating a
stream of digitized video data organized into image
frames. The digitized video image data is provided to
the facial sensing process, which locates the person's
face and corresponding facial features in each frame.
The facial sensing process also tracks the positions and
characteristics of the facial features from frame to
frame. The tracking information may be transmitted via
the network to one or more remote sites, which receive
the information and generate, using a graphics engine,
an animated facial image on the avatar display. The animated
facial image may be based on a photorealistic
model of the person, a cartoon character or a face completely
unrelated to the user.

[0016] The imaging system 12 and the facial sensing
process 14 are shown in more detail in FIGS. 2 and 3.
The imaging system captures the person's image using
a digital video camera 22 which generates a stream of
video image frames. The video image frames are transferred
into a video random-access memory (VRAM) 24
for processing. A satisfactory imaging system is the Matrox
Meteor II available from Matrox™, which digitizes
images produced by a conventional CCD camera
and transfers the images in real time into the memory
at a frame rate of 30 Hz. The image frame is processed
by an image processor 26 having a central
processing unit (CPU) 28 coupled to the VRAM and
random-access memory (RAM) 30. The RAM stores program
code and data for implementing the facial sensing and
avatar animation processes.

[0017] The facial feature process operates on the digitized
images to find the person's facial features (block
32), track the features (block 34), and reinitialize feature
tracking as needed. The facial features also may
be classified (block 36). The facial feature process generates
data associated with the position and classification
of the facial features, which is provided to an interface
with the facial animation process (block 38).
[0018] The facial features may be located using the
elastic graph matching shown in FIG. 4. In the elastic
graph matching technique, a captured image (block 40)
is transformed into Gabor space using a wavelet transformation
(block 42), which is described below in more
detail with respect to FIG. 5. The transformed image
(block 44) is represented by 40 complex values, representing
wavelet components, for each pixel of the original
image. Next, a rigid copy of a model graph, which
is described in more detail below with respect to FIG. 7,
is positioned over the transformed image at varying
model node positions to locate a position of optimum
similarity (block 46). The search for the optimum similarity
may be performed by positioning the model graph
in the upper left-hand corner of the image, extracting the
jets at the nodes, and determining the similarity between
the image graph and the model graph. The search continues
by sliding the model graph left to right starting
from the upper-left corner of the image (block 48). When
a rough position of the face is found (block 50), the
nodes are individually allowed to move, introducing
elastic graph distortions (block 52). A phase-insensitive
similarity function is used in order to locate a good match
(block 54). A phase-sensitive similarity function is then
used to locate a jet with accuracy because the phase is
very sensitive to small jet displacements. The phase-insensitive
and the phase-sensitive similarity functions
are described below with respect to FIGS. 5-8. Note that
although the graphs are shown in FIG. 4 with respect to
the original image, the model graph movements and
matching are actually performed on the transformed image.

[0019] The wavelet transform is described with reference
to FIG. 5. An original image is processed using a
Gabor wavelet to generate a convolution result. The
Gabor-based wavelet consists of a two-dimensional complex
wave field modulated by a Gaussian envelope.

$$\psi_{\vec{k}}(\vec{x}) = \frac{k^2}{\sigma^2}\, e^{-\frac{k^2 x^2}{2\sigma^2}} \left\{ e^{i\vec{k}\cdot\vec{x}} - e^{-\frac{\sigma^2}{2}} \right\} \tag{1}$$
[0020] The wavelet is a plane wave with wave vector
$\vec{k}$, restricted by a Gaussian window, the size of which
relative to the wavelength is parameterized by $\sigma$. The
term in the brace removes the DC component. The amplitude
of the wave vector $k$ may be chosen as follows,
where $\nu$ is related to the desired spatial resolutions.

$$k_\nu = 2^{-\frac{\nu+2}{2}}\,\pi, \quad \nu = 1, 2, \ldots \tag{2}$$

A wavelet centered at image position $\vec{x}$ is used to extract
the wavelet component $J_{\vec{k}}$ from the image with gray
level distribution $I(\vec{x})$,

$$J_{\vec{k}}(\vec{x}) = \int d\vec{x}'\, I(\vec{x}')\, \psi_{\vec{k}}(\vec{x} - \vec{x}') \tag{3}$$

[0021] The space of wave vectors $\vec{k}$ is typically sampled
in a discrete hierarchy of 5 resolution levels (differing
by half-octaves) and 8 orientations at each resolution
level (see, e.g., FIG. 8), thus generating 40 complex
values for each sampled image point (the real and imaginary
components referring to the cosine and sine
phases of the plane wave). The samples in k-space are
designated by the index j = 1,...,40 and all wavelet components
centered in a single image point are considered
as a vector which is called a jet 60, shown in FIG. 6.
Each jet describes the local features of the area surrounding
$\vec{x}$. If sampled with sufficient density, the image
may be reconstructed from jets within the bandpass covered
by the sampled frequencies. Thus, each component
of a jet is the filter response of a Gabor wavelet
extracted at a point (x, y) of the image.
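
By way of illustration only, the sampling of k-space and the extraction of a jet may be sketched as follows. The sketch assumes a negative exponent in equation (2), a width parameter of 2π, orientations evenly spaced over a half circle, and an integration window truncated to a small radius; none of these values is stated in the text, and the function names are hypothetical.

```python
import cmath
import math

SIGMA = 2 * math.pi  # assumed width parameter; not given in the text


def wave_vectors(levels=5, orientations=8):
    """Return 40 wave vectors: 5 half-octave levels x 8 orientations."""
    vectors = []
    for nu in range(1, levels + 1):
        k = 2 ** (-(nu + 2) / 2) * math.pi  # magnitude, equation (2)
        for mu in range(orientations):
            angle = mu * math.pi / orientations  # assumed even spacing
            vectors.append((k * math.cos(angle), k * math.sin(angle)))
    return vectors


def gabor(kvec, x, y, sigma=SIGMA):
    """Evaluate the DC-free Gabor wavelet of equation (1) at offset (x, y)."""
    k2 = kvec[0] ** 2 + kvec[1] ** 2
    envelope = (k2 / sigma**2) * math.exp(-k2 * (x * x + y * y) / (2 * sigma**2))
    wave = cmath.exp(1j * (kvec[0] * x + kvec[1] * y)) - math.exp(-sigma**2 / 2)
    return envelope * wave


def jet(image, cx, cy, radius=8):
    """Jet at pixel (cx, cy): one complex response per wave vector.

    `image` is a list of rows of gray values. The integral of equation (3)
    is approximated by a sum over a (2*radius+1)^2 window (an assumption);
    pixels outside the image are skipped.
    """
    height, width = len(image), len(image[0])
    responses = []
    for kvec in wave_vectors():
        total = 0j
        for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
                total += image[y][x] * gabor(kvec, cx - x, cy - y)
        responses.append(total)
    return responses
```

A practical system would compute the responses by convolution over the whole frame rather than pixel by pixel; the direct sum is used here only to keep the correspondence with equations (1) to (3) visible.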
[0022] A labeled image graph 62, as shown in FIG. 6, 
is used to describe the aspect of an object (in this con- 
text, a face). The nodes 64 of the labeled graph refer to 
points on the object and are labeled by jets 60. Edges 
66 of the graph are labeled with distance vectors be- 
tween the nodes. Nodes and edges define the graph to- 
pology. Graphs with equal topology can be compared. 
The normalized dot product of the absolute components 
of two jets defines the jet similarity. This value is inde- 
pendent of contrast changes. To compute the similarity 
between two graphs, the sum is taken over similarities 
of corresponding jets between the graphs. 
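
By way of illustration only, the jet similarity and graph similarity described above may be sketched as follows. Jets are represented as lists of complex numbers and graphs as lists of jets in a fixed node order; this representation and the function names are assumptions made for the example.

```python
import math


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of the amplitudes of two complex jets."""
    amps_a = [abs(c) for c in jet_a]
    amps_b = [abs(c) for c in jet_b]
    dot = sum(a * b for a, b in zip(amps_a, amps_b))
    norm = math.sqrt(sum(a * a for a in amps_a) * sum(b * b for b in amps_b))
    return dot / norm if norm > 0 else 0.0


def graph_similarity(graph_a, graph_b):
    """Sum of jet similarities over corresponding nodes of two graphs.

    Each graph is a list of jets in the same node order (equal topology).
    """
    if len(graph_a) != len(graph_b):
        raise ValueError("graphs must have the same topology")
    return sum(jet_similarity(a, b) for a, b in zip(graph_a, graph_b))
```

Because the amplitudes are normalized, multiplying every component of a jet by the same positive factor leaves the similarity unchanged, which is the contrast independence noted above.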
[0023] A model graph 68 that is particularly designed 
for finding a human face in an image is shown in FIG. 
7. The numbered nodes of the graph have the following 
locations: 



Node   Location
0      right eye pupil
1      left eye pupil
2      top of the nose
3      right corner of the right eyebrow
4      left corner of the right eyebrow
5      right corner of the left eyebrow
6      left corner of the left eyebrow
7      right nostril
8      tip of the nose
9      left nostril
10     right corner of the mouth
11     center of the upper lip
12     left corner of the mouth
13     center of the lower lip
14     bottom of the right ear
15     top of the right ear
16     top of the left ear
17     bottom of the left ear

To represent a face, a data structure called a bunch graph
70 (FIG. 6) is used. It is similar to the graph described 
above, but instead of attaching only a single jet to each 
node, a whole bunch of jets 72 (a bunch jet) are attached 
to each node. Each jet is derived from a different facial 
image. To form a bunch graph, a collection of facial im- 
ages (the bunch graph gallery) is marked with node lo- 
cations at defined positions of the head. These defined 
positions are called landmarks. When matching a bunch 
graph to an image, the jet extracted from the image is 
compared to all jets in the corresponding bunch at- 
tached to the bunch graph and the best-matching one 
is selected. This matching process is called elastic 
bunch graph matching. When constructed using a judi- 
ciously selected gallery, a bunch graph covers a great 
variety of faces that may have significantly different local 
properties, e.g., samples of male and female faces, and
of persons of different ages or races. 
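
By way of illustration only, the selection of the best-matching jet within a bunch may be sketched as follows. The amplitude-based similarity is used for the comparison, and the data layout (a bunch as a list of jets, a jet as a list of complex numbers) and the function names are assumptions made for the example.

```python
import math


def amplitude_similarity(jet_a, jet_b):
    """Normalized dot product of the amplitudes of two jets."""
    dot = sum(abs(a) * abs(b) for a, b in zip(jet_a, jet_b))
    norm = math.sqrt(
        sum(abs(a) ** 2 for a in jet_a) * sum(abs(b) ** 2 for b in jet_b)
    )
    return dot / norm if norm > 0 else 0.0


def best_bunch_match(image_jet, bunch):
    """Return (index, similarity) of the bunch jet most similar to image_jet."""
    scores = [amplitude_similarity(image_jet, candidate) for candidate in bunch]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

The returned index identifies which gallery face supplied the winning jet at that node, so different nodes of the same graph may be matched by jets from different gallery faces.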
[0024] Again, in order to find a face in an image frame,
the graph is moved, scaled and distorted until a
place is located at which the graph matches best (the
best fitting jets within the bunch jets are most similar to
jets extracted from the image at the current positions of
the nodes). Since face features differ from face to face,
the graph is made more general for the task, e.g., each
node is assigned jets of the corresponding landmark
taken from 10 to 100 individual faces.
[0025] Two different jet similarity functions for two different,
or even complementary, tasks are employed. If
the components of a jet $J$ are written in the form
$J_j = a_j e^{i\phi_j}$ with amplitude $a_j$ and phase $\phi_j$,
one form for the similarity of two
jets $J$ and $J'$ is the normalized scalar product of the
amplitude vector

$$S(J, J') = \frac{\sum_j a_j a'_j}{\sqrt{\sum_j a_j^2 \sum_j a'^2_j}} \tag{4}$$

The other similarity function has the form

$$S(J, J') = \frac{\sum_j a_j a'_j \cos(\phi_j - \phi'_j - \vec{d}\cdot\vec{k}_j)}{\sqrt{\sum_j a_j^2 \sum_j a'^2_j}} \tag{5}$$

This function includes a relative displacement vector $\vec{d}$ between
the image points to which the two jets refer. When
comparing two jets during graph matching, the similarity 
between them is maximized with respect to d, leading 
to an accurate determination of jet position. Both simi- 
larity functions are used, with preference often given to 
the phase-insensitive version (which varies smoothly 
with relative position) when first matching a graph, and 
to the phase-sensitive version when accurately position- 
ing the jet. 
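
By way of illustration only, the phase-sensitive similarity may be sketched as follows, assuming it takes the cosine-weighted form S(J, J', d) given as equation (6) below. Jets are lists of complex numbers paired with their wave vectors; the function name and data layout are hypothetical.

```python
import cmath
import math


def phase_similarity(jet_a, jet_b, wave_vectors, d=(0.0, 0.0)):
    """Phase-sensitive similarity of two jets for a trial displacement d."""
    numerator = 0.0
    energy_a = 0.0
    energy_b = 0.0
    for ca, cb, (kx, ky) in zip(jet_a, jet_b, wave_vectors):
        a, b = abs(ca), abs(cb)
        dphi = cmath.phase(ca) - cmath.phase(cb)
        # Each component is weighted by how well d explains its phase shift.
        numerator += a * b * math.cos(dphi - (d[0] * kx + d[1] * ky))
        energy_a += a * a
        energy_b += b * b
    norm = math.sqrt(energy_a * energy_b)
    return numerator / norm if norm > 0 else 0.0
```

With d set to zero the function penalizes any phase mismatch, whereas evaluating it at the displacement that explains the phase differences restores a value of 1 for otherwise identical jets. This is why the phase-sensitive form is suited to accurate positioning rather than to the first rough match.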

[0026] After the facial features are located, the facial 
features may be tracked over consecutive frames as il- 
lustrated in FIG. 9. The tracking technique of the inven- 
tion achieves robust tracking over long frame sequenc- 
es by using a tracking correction scheme that detects 
whether tracking of a feature or node has been lost and 
reinitializes the tracking process for that node. 
[0027] The position X_n of a single node in an image
I_n of an image sequence is known either by landmark
finding on image I_n using the landmark finding method
(block 80) described above, or by tracking the node from
image I_(n-1) to I_n using the tracking process. The
node is then tracked (block 82) to a corresponding position
X_(n+1) in the image I_(n+1) by one of several
techniques. The tracking methods described below
advantageously accommodate fast motion.
[0028] A first tracking technique involves linear mo- 
tion prediction. The search for the corresponding node 
position X_(n+1) in the new image I_(n+1) is started at
a position generated by a motion estimator. A disparity 
vector (X_n - X_(n-1)) is calculated that represents the 
displacement, assuming constant velocity, of the node 
between the preceding two frames. The disparity or dis- 
placement vector D_n may be added to the position X_n 
to predict the node position X_(n+1). This linear motion
model is particularly advantageous for accommodating 
constant velocity motion. The linear motion model also 
provides good tracking if the frame rate is high com- 
pared to the acceleration of the objects being tracked. 
However, the linear motion model performs poorly if the 
frame rate is too low compared to the acceleration of the 
objects in the image sequence. Because it is difficult for 
any motion model to track objects under such condi- 
tions, use of a camera having a higher frame rate is rec- 
ommended. 

[0029] The linear motion model may generate too
large an estimated motion vector D_n, which could
lead to an accumulation of the error in the motion estimation.
Accordingly, the linear prediction may be
damped using a damping factor f_D. The resulting estimated
motion vector is D_n = f_D * (X_n - X_(n-1)). A
suitable damping factor is 0.9. If no previous frame
I_(n-1) exists, e.g., for a frame immediately after landmark
finding, the estimated motion vector is set equal to zero
(D_n = 0).
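
By way of illustration only, the damped linear motion prediction may be sketched as follows. Positions are (x, y) tuples, and the function name is hypothetical.

```python
def predict_position(x_n, x_prev=None, damping=0.9):
    """Predict X_(n+1) = X_n + D_n with D_n = f_D * (X_n - X_(n-1))."""
    if x_prev is None:
        # No previous frame, e.g. immediately after landmark finding.
        d_n = (0.0, 0.0)
    else:
        d_n = tuple(damping * (c - p) for c, p in zip(x_n, x_prev))
    return tuple(c + d for c, d in zip(x_n, d_n))
```

For a node that moved from (8, 21) to (10, 20), the prediction with a damping factor of 0.9 lies at approximately (11.8, 19.1), slightly short of the undamped constant-velocity estimate of (12, 19).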

[0030] A tracking technique based on a Gaussian im- 
age pyramid, applied to one dimension, is illustrated in 
FIG. 10. Instead of using the original image resolution, 
the image is down sampled 2-4 times to create a Gaus- 
sian pyramid of the image. An image pyramid of 4 levels 
results in a distance of 24 pixels on the finest, original 
resolution level being represented as only 3 pixels on 
the coarsest level. Jets may be computed and com- 
pared at any level of the pyramid. 
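
By way of illustration only, a one-dimensional Gaussian pyramid may be sketched as follows. The 1-4-6-4-1 binomial smoothing kernel and the clamped border handling are assumptions chosen for the example; the text does not specify the filter.

```python
def downsample(signal):
    """Smooth with a 1-4-6-4-1 binomial kernel, then keep every second sample."""
    kernel = (1, 4, 6, 4, 1)
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for offset, weight in zip(range(-2, 3), kernel):
            j = min(max(i + offset, 0), n - 1)  # clamp at the borders
            acc += weight * signal[j]
        smoothed.append(acc / 16.0)
    return smoothed[::2]


def build_pyramid(signal, levels=4):
    """Return [finest, ..., coarsest]; each level halves the resolution."""
    pyramid = [list(signal)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def to_level(distance, level):
    """Express a distance measured on level 0 in pixels of a coarser level."""
    return distance / (2 ** level)
```

A pyramid of 4 levels involves three halvings, so a distance of 24 pixels on the original level corresponds to `to_level(24, 3)`, i.e. 3 pixels on the coarsest level.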
[0031] Tracking of a node on the Gaussian image pyramid
is generally performed first at the coarsest level
and then proceeding to the finest level. A jet is extracted
on the coarsest Gauss level of the actual image
frame I_(n+1) at the position X_(n+1) using the damped
linear motion estimation X_(n+1) = (X_n + D_n) as described
above, and compared to the corresponding jet
computed on the coarsest Gauss level of the previous
image frame. From these two jets, the disparity is determined,
i.e., the 2D vector R pointing from X_(n+1) to that
position that corresponds best to the jet from the previous
frame. This new position is assigned to X_(n+1).
The disparity calculation is described below in more detail.
The position on the next finer Gauss level of the
actual image (being 2*X_(n+1)), corresponding to the
position X_(n+1) on the coarsest Gauss level, is the
starting point for the disparity computation on this next
finer level. The jet extracted at this point is compared to
the corresponding jet calculated on the same Gauss level
of the previous image frame. This process is repeated
for all Gauss levels until the finest resolution level is
reached, or until the Gauss level is reached which is
specified for determining the position of the node corresponding
to the previous frame's position.
[0032] Two representative levels of the Gaussian image
pyramid are shown in FIG. 10, a coarser level
94 above, and a finer level 96 below. Each jet is assumed to
have filter responses for two frequency levels. Starting
at position 1 on the coarser Gauss level, X_(n+1)
= X_n + D_n, a first disparity move using only the lowest
frequency jet coefficients leads to position 2. A second 
disparity move by using all jet coefficients of both fre- 
quency levels leads to position 3, the final position on 
this Gauss level. Position 1 on the finer Gauss level cor- 
responds to position 3 on the coarser level with the co- 
ordinates being doubled. The disparity move sequence 
is repeated, and position 3 on the finest Gauss level is 
the final position of the tracked landmark. For more ac- 
curate tracking, the number of Gauss and frequency lev- 
els may be increased. 

[0033] After the new position of the tracked node in 
the actual image frame has been determined, the jets 
on all Gauss levels are computed at this position. A 
stored array of jets that was computed for the previous 
frame, representing the tracked node, is then replaced 
by a new array of jets computed for the current frame. 
[0034] Use of the Gauss image pyramid has two main
advantages. First, movements of nodes are much smaller
in terms of pixels on a coarser level than in the original
image, which makes tracking possible by performing only
a local move instead of an exhaustive search in a large
image region. Second, the computation of jet components
is much faster for lower frequencies, because the
computation is performed with a small kernel window on
a down-sampled image, rather than with a large kernel
window on the original resolution image.
[0035] Note that the correspondence level may be
chosen dynamically, e.g., in the case of tracking facial
features, the correspondence level may be chosen depending
on the actual size of the face. Also, the size of the
Gauss image pyramid may be altered during the tracking
process, i.e., the size may be increased when motion
gets faster, and decreased when motion gets slower.
Typically, the maximal node movement on the coarsest
Gauss level is limited to 4 pixels. Also note that the motion
estimation is often performed only on the coarsest level.
[0036] The computation of the displacement vector
between two given jets on the same Gauss level (the
disparity vector) is now described. To compute the displacement
between two consecutive frames, a method
is used which was originally developed for disparity estimation
in stereo images, based on D. J. Fleet and A.
D. Jepson, "Computation of component image velocity
from local phase information", International Journal of
Computer Vision, volume 5, issue 1, pages 77-104,
1990, and W. M. Theimer and H. A. Mallot, "Phase-based
binocular vergence control and depth reconstruction using
active vision", CVGIP: Image Understanding, volume
60, issue 3, pages 343-358, November 1994.
[0037] The strong variation of the phases of the complex
filter responses is used explicitly to compute the
displacement with subpixel accuracy (Wiskott, L., "Labeled
Graphs and Dynamic Link Matching for Face Recognition
and Scene Analysis", Verlag Harri Deutsch,
Thun-Frankfurt am Main, Reihe Physik 53 (PhD thesis,
1995)). By writing the response $J_j$ to the jth Gabor filter
in terms of amplitude $a_j$ and phase $\phi_j$, a similarity function
can be defined as

$$S(J, J', \vec{d}) = \frac{\sum_j a_j a'_j \cos(\phi_j - \phi'_j - \vec{d}\cdot\vec{k}_j)}{\sqrt{\sum_j a_j^2 \sum_j a'^2_j}} \tag{6}$$

Let J and J' be two jets at positions X and X' = X + d. The
displacement d may be found by maximizing the similarity
S with respect to d, the k_j being the wave vectors
associated with the filter generating J_j. Because the estimation
of d is only precise for small displacements, i.e.,
large overlap of the Gabor jets, large displacement
vectors are treated as a first estimate only, and the process
is repeated in the following manner. First, only the
filter responses of the lowest frequency level are used,
resulting in a first estimate d_1. Next, this estimate is
executed and the jet J is recomputed at the position
X_1 = X + d_1, which is closer to the position X' of jet J'.
Then, the lowest two frequency levels are used for the
estimation of the displacement d_2, and the jet J is recomputed
at the position X_2 = X_1 + d_2. This is iterated
until the highest frequency level used is reached, and
the final disparity d between the two start jets J and J'
is given as the sum d = d_1 + d_2 + .... Accordingly, displacements
of up to half the wavelength of the kernel
with the lowest frequency may be computed (see Wiskott
1995, supra).
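
By way of illustration only, a single estimation step for small displacements may be sketched as follows. The sketch assumes the maximization of S is carried out by expanding the cosine to second order, which reduces the problem to a weighted least-squares fit of the phase differences and a 2x2 linear system; the text does not spell out the maximization method, and the function names are hypothetical.

```python
import cmath
import math


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def estimate_displacement(jet_a, jet_b, wave_vectors):
    """Estimate the d that maximizes sum a a' cos(phi - phi' - d.k) for small d.

    With cos(t) ~ 1 - t^2 / 2, the maximum satisfies G d = p, where
    G = sum w k k^T, p = sum w (phi - phi') k, and w = a * a'.
    """
    gxx = gxy = gyy = px = py = 0.0
    for ca, cb, (kx, ky) in zip(jet_a, jet_b, wave_vectors):
        w = abs(ca) * abs(cb)
        dphi = wrap(cmath.phase(ca) - cmath.phase(cb))
        gxx += w * kx * kx
        gxy += w * kx * ky
        gyy += w * ky * ky
        px += w * dphi * kx
        py += w * dphi * ky
    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-12:
        return (0.0, 0.0)  # degenerate set of wave vectors; no estimate
    return ((gyy * px - gxy * py) / det, (gxx * py - gxy * px) / det)
```

To follow the iteration described above, this step would be called first with only the lowest-frequency components, the jet would be re-extracted from the image at the shifted position, and the step would be repeated with progressively more frequency levels, summing the partial estimates.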

[0038] Although the displacements are determined
using floating point numbers, jets may be extracted (i.e.,
computed by convolution) at (integer) pixel positions
only, resulting in a systematic rounding error. To compensate
for this subpixel error $\Delta\vec{d}$, the phases of the
complex Gabor filter responses should be shifted according
to

$$\Delta\phi_j = \Delta\vec{d}\cdot\vec{k}_j \tag{7}$$

so that the jets will appear as if they were extracted at 
the correct subpixel position. Accordingly, the Gabor jets 
may be tracked with subpixel accuracy without any fur- 
ther accounting of rounding errors. Note that Gabor jets 
provide a substantial advantage in image processing 
because the problem of subpixel accuracy is more diffi- 
cult to address in most other image processing meth- 
ods. 
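
By way of illustration only, the subpixel phase compensation of equation (7) may be sketched as follows. The sign convention (the residual is the floating point position minus the rounded pixel position, and the phase shift is added) is an assumption, as are the function names.

```python
import cmath
import math


def split_subpixel(position):
    """Split a floating point position into a pixel and a subpixel residual."""
    pixel = tuple(math.floor(c + 0.5) for c in position)
    residual = tuple(c - p for c, p in zip(position, pixel))
    return pixel, residual


def shift_jet(jet, wave_vectors, delta):
    """Shift the phase of each jet component by delta . k_j (equation (7))."""
    return [
        c * cmath.exp(1j * (delta[0] * kx + delta[1] * ky))
        for c, (kx, ky) in zip(jet, wave_vectors)
    ]
```

In use, a jet would be extracted at the integer pixel returned by `split_subpixel` and then passed through `shift_jet` with the residual, leaving the amplitudes untouched and adjusting only the phases.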

[0039] Tracking error may be detected by determining 
whether a confidence or similarity value is smaller than 
a predetermined threshold (block 84 of FIG. 9). The sim- 
ilarity (or confidence) value S may be calculated to indi- 
cate how well the two image regions in the two image 
frames correspond to each other simultaneously with the
calculation of the displacement of a node between consecutive
image frames. Typically, the confidence value
is close to 1, indicating good correspondence. If the confidence
value is not close to 1, either the corresponding
point in the image has not been found (e.g., because 
the frame rate was too low compared to the velocity of 
the moving object), or this image region has changed 
so drastically from one image frame to the next, that the 
correspondence is no longer well defined (e.g., for the 
node tracking the pupil of the eye the eyelid has been 
closed). Nodes having a confidence value below a cer- 
tain threshold may be switched off. 
[0040] A tracking error also may be detected when
certain geometrical constraints are violated (block 86).
If many nodes are tracked simultaneously, the geometrical
configuration of the nodes may be checked for consistency.
Such geometrical constraints may be fairly
loose, e.g., when facial features are tracked, the nose
must be between the eyes and the mouth. Alternatively,
such geometrical constraints may be rather accurate,
e.g., a model containing the precise shape information of
the tracked face. For intermediate accuracy, the constraints
may be based on a flat plane model. In the flat
plane model, the nodes of the face graph are assumed
to be on a flat plane. For image sequences that start
with the frontal view, the tracked node positions may be
compared to the corresponding node positions of the
frontal graph transformed by an affine transformation to
the actual frame. The 6 parameters of the optimal affine
transformation are found by minimizing the least
squares error in the node positions. Deviations between
the tracked node positions and the transformed node
positions are compared to a threshold. The nodes having
deviations larger than the threshold are switched off.
The parameters of the affine transformation may be
used to determine the pose and relative scale (compared
to the start graph) simultaneously (block 88).
Thus, this rough flat plane model assures that tracking
errors may not grow beyond a predetermined threshold.
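
By way of illustration only, the flat plane consistency check may be sketched as follows. The 6 affine parameters are obtained from the normal equations of the least-squares problem; the solver, the function names and the use of Euclidean deviation are assumptions made for the example, and at least three non-collinear nodes are required.

```python
import math


def solve3(m, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine map taking src points to dst points.

    Returns [[a, b, tx], [c, d, ty]], i.e. the 6 parameters.
    """
    rows = [(x, y, 1.0) for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):
        atb = [sum(r[i] * d[axis] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params


def apply_affine(params, point):
    """Map a point through the fitted affine transformation."""
    x, y = point
    return tuple(p[0] * x + p[1] * y + p[2] for p in params)


def nodes_to_switch_off(frontal, tracked, threshold):
    """Indices of tracked nodes deviating from the affine fit by more than threshold."""
    params = fit_affine(frontal, tracked)
    bad = []
    for i, (f, t) in enumerate(zip(frontal, tracked)):
        fx, fy = apply_affine(params, f)
        if math.hypot(t[0] - fx, t[1] - fy) > threshold:
            bad.append(i)
    return bad
```

Because the fit is computed from all nodes, a single drifting node pulls the transformation only slightly and still shows the largest deviation, which is what allows it to be singled out by the threshold.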
[0041] If a tracked node is switched off because of a
tracking error, the node may be reactivated at the correct
position (block 90), advantageously using bunch graphs
that include different poses, and tracking continued from
the corrected position (block 92). After a tracked node
has been switched off, the system may wait until a
predefined pose is reached for which a pose specific bunch
graph exists. Otherwise, if only a frontal bunch graph is
stored, the system must wait until the frontal pose is
reached to correct any tracking errors. The stored bunch of
jets may be compared to the image region surrounding the
fit position (e.g., from the flat plane model), which works
in the same manner as tracking, except that instead of
comparing with the jet of the previous image frame, the
comparison is repeated with all jets of the bunch of
examples, and the most similar one is taken. Because the
facial features are known, e.g., the actual pose, scale,
and even the rough position, graph matching or an
exhaustive search in the image and/or pose space is
not needed and node tracking correction may be
performed in real time.
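The reactivation step can be sketched as follows. This is only an illustration under stated assumptions: jets are represented as plain lists of magnitudes, the similarity is a normalized dot product (the specification defines its own jet similarity functions elsewhere), and `image_jets`, `radius`, and `reinitialize_node` are hypothetical names.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Jet = Sequence[float]
Position = Tuple[int, int]


def jet_similarity(j1: Jet, j2: Jet) -> float:
    """Normalized dot product of two jets (an assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(j1, j2))
    n1 = math.sqrt(sum(a * a for a in j1))
    n2 = math.sqrt(sum(b * b for b in j2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)


def reinitialize_node(
    bunch: Sequence[Jet],
    image_jets: Dict[Position, Jet],
    fit_position: Position,
    radius: int,
) -> Tuple[Optional[Position], float]:
    """Search only near the fit position and compare against every jet in the bunch.

    Returns the position whose image jet is most similar to any stored
    example, together with that similarity.
    """
    fx, fy = fit_position
    best_pos, best_sim = None, -1.0
    for (x, y), jet in image_jets.items():
        if abs(x - fx) > radius or abs(y - fy) > radius:
            continue  # outside the local region around the fit position
        sim = max(jet_similarity(jet, example) for example in bunch)
        if sim > best_sim:
            best_pos, best_sim = (x, y), sim
    return best_pos, best_sim
```

Restricting the search to a small window around the fit position is what keeps the correction cheap compared with a full graph match over the image.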

[0042] For tracking correction, bunch graphs are not
needed for many different poses and scales because
rotation in the image plane as well as scale may be taken
into account by transforming either the local image
region or the jets of the bunch graph accordingly as shown
in FIG. 11. In addition to the frontal pose, bunch graphs
need to be created only for rotations in depth.

[0043] The speed of the reinitialization process may
be increased by taking advantage of the fact that the
identity of the tracked person remains the same during
an image sequence. Accordingly, in an initial learning
session, a first sequence of the person may be taken
with the person exhibiting a full repertoire of frontal facial
expressions. This first sequence may be tracked with
high accuracy using the tracking and correction scheme
described above based on a large generalized bunch
graph that contains knowledge about many different
persons. This process may be performed offline and
generates a new personalized bunch graph. The
personalized bunch graph then may be used for tracking
this person at a fast rate in real time because the
personalized bunch graph is much smaller than the larger,
generalized bunch graph.

[0044] The speed of the reinitialization process also 
may be increased by using a partial bunch graph reini- 
tialization. A partial bunch graph contains only a subset 
of the nodes of a full bunch graph. The subset may be 
as small as only a single node. 
[0045] A pose estimation bunch graph makes use of
a family of two-dimensional bunch graphs defined in the
image plane. The different graphs within one family
account for different poses and/or scales of the head. The
landmark finding process attempts to match each bunch
graph from the family to the input image in order to
determine the pose or size of the head in the image. An
example of such a pose estimation procedure is shown in
FIG. 12. The first step of the pose estimation is equivalent
to that of the regular landmark finding. The image
(block 98) is transformed (blocks 100 and 102) in order
to use the graph similarity functions. Then, instead of
only one, a family of three bunch graphs is used. The
first bunch graph contains only the frontal pose faces
(equivalent to the frontal view described above), and the
other two bunch graphs contain quarter-rotated faces
(one representing rotations to the left and one to the
right). As before, the initial position for each of the
graphs is in the upper left corner, and the positions of
the graphs are scanned on the image and the position
and graph returning the highest similarity after the
landmark finding is selected (blocks 104-114).
[0046] After initial matching for each graph, the
similarities of the final positions are compared (block 116).
The graph that best corresponds to the pose given on
the image will have the highest similarity. In FIG. 12, the
left-rotated graph provides the best fit to the image, as
indicated by its similarity (block 118). Depending on
resolution and degree of rotation of the face in the picture,
similarity of the correct graph and graphs for other poses
would vary, becoming very close when the face is about
half way between the two poses for which the graphs
have been defined. By creating bunch graphs for more
poses, a finer pose estimation procedure may be
implemented that would discriminate between more degrees
of head rotation and handle rotations in other directions
(e.g., up or down).
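The final comparison of block 116 amounts to picking the graph with the highest similarity. A minimal sketch follows, assuming each bunch graph of the family has already been matched and its final similarity recorded under a pose name; the function name `estimate_pose` and the returned margin are illustrative additions, not part of the specification.

```python
from typing import Dict, Tuple


def estimate_pose(similarities: Dict[str, float]) -> Tuple[str, float]:
    """Pick the pose whose bunch graph matched best.

    Also returns the margin to the runner-up; a small margin suggests the
    face lies roughly half way between two of the stored poses.
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    best_pose, best_sim = ranked[0]
    margin = best_sim - ranked[1][1] if len(ranked) > 1 else float("inf")
    return best_pose, margin


pose, margin = estimate_pose({"frontal": 0.80, "left": 0.91, "right": 0.62})
```

The same selection could be applied to a family of graphs that differ in scale rather than pose.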

[0047] In order to robustly find a face at an arbitrary
distance from the camera, a similar approach may be
used in which two or three different bunch graphs, each
having a different scale, are matched to the image. The
face in the image is assumed to have the same scale as the
bunch graph that returns the highest similarity to the
facial image.
[0048] A three dimensional (3D) landmark finding
technique related to the technique described above also
may use multiple bunch graphs adapted to different
poses. However, the 3D approach employs only one
bunch graph defined in 3D space. The geometry of the
3D graph reflects an average face or head geometry. By
extracting jets from images of the faces of several persons
in different degrees of rotation, a 3D bunch graph
is created which is analogous to the 2D approach. Each
jet is now parameterized with the three rotation angles.
As in the 2D approach, the nodes are located at the
fiducial points of the head surface. Projections of the 3D
graph are then used in the matching process. One
important generalization of the 3D approach is that every
node has an attached parameterized family of bunch
jets adapted to different poses. The second generalization
is that the graph may undergo Euclidean transformations
in 3D space and not only transformations in the
image plane.
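How a 3D graph might be projected for matching can be sketched as follows. The specification does not state an angle convention or a camera model; this sketch assumes pitch, yaw, and roll are rotations about the x, y, and z axes applied in that order, and an orthographic projection that simply drops the depth coordinate. The names `rotate` and `project_graph` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def rotate(point: Point3D, yaw: float, pitch: float, roll: float) -> Point3D:
    """Rotate a 3D node by pitch (x axis), then yaw (y axis), then roll (z axis)."""
    x, y, z = point
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    x, y = (x * math.cos(roll) - y * math.sin(roll),
            x * math.sin(roll) + y * math.cos(roll))
    return x, y, z


def project_graph(
    nodes: Sequence[Point3D],
    yaw: float,
    pitch: float,
    roll: float,
    scale: float = 1.0,
    offset: Point2D = (0.0, 0.0),
) -> List[Point2D]:
    """Project the rotated 3D graph into the image plane (orthographic assumption)."""
    projected = []
    for node in nodes:
        x, y, _ = rotate(node, yaw, pitch, roll)
        projected.append((scale * x + offset[0], scale * y + offset[1]))
    return projected
```

For each candidate set of rotation angles, the projected node positions would be matched against the image using the jets stored for that pose.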

[0049] The graph matching process may be formulat- 
ed as a coarse-to-fine approach that first utilizes graphs 
with fewer nodes and kernels and then in subsequent 
steps utilizes denser graphs. The coarse-to-fine
approach is particularly suitable if high precision locali- 
zation of the feature points in certain areas of the face 
is desired. Thus, computational effort is saved by adopt- 
ing a hierarchical approach in which landmark finding is 
first performed on a coarser resolution, and subsequent- 
ly the adapted graphs are checked at a higher resolution 
to analyze certain regions in finer detail. 
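The saving from a hierarchical search can be illustrated with a simplified sketch for a single feature position. It assumes a hypothetical `score(x, y)` function that returns the match quality at a pixel; the specification's coarse-to-fine scheme operates on graphs of increasing density and resolution, so this only shows the general idea.

```python
from typing import Callable, Tuple


def coarse_to_fine_search(
    score: Callable[[int, int], float],
    width: int,
    height: int,
    coarse_step: int = 8,
) -> Tuple[Tuple[int, int], int]:
    """Find the best-scoring position with a coarse pass followed by a local fine pass.

    Returns the best position and the number of score evaluations used.
    """
    # Coarse pass: evaluate only every coarse_step-th pixel.
    coarse = [(x, y)
              for y in range(0, height, coarse_step)
              for x in range(0, width, coarse_step)]
    cx, cy = max(coarse, key=lambda p: score(p[0], p[1]))
    # Fine pass: evaluate every pixel within one coarse step of the coarse winner.
    fine = [(x, y)
            for y in range(max(0, cy - coarse_step + 1), min(height, cy + coarse_step))
            for x in range(max(0, cx - coarse_step + 1), min(width, cx + coarse_step))]
    best = max(fine, key=lambda p: score(p[0], p[1]))
    return best, len(coarse) + len(fine)
```

The approach relies on the score varying smoothly enough that the coarse winner lies near the true optimum; a sharply peaked score could be missed by the coarse pass.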
[0050] Further, the computational workload may be 
easily split on a multi-processor machine such that once 
the coarse regions are found, a few child processes start 
working in parallel each on its own part of the whole im- 
age. At the end of the child processes, the processes 
communicate the feature coordinates that they located 
to the master process, which appropriately scales and 
combines them to fit back into the original image thus 
considerably reducing the total computation time. 
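The master and child arrangement might look like the following sketch. It uses a thread pool from the Python standard library for portability, whereas the text describes child processes; `finder`, the region tuple layout `(left, top, scale)`, and `find_features_parallel` are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple

Region = Tuple[float, float, float]  # (left, top, scale) of a region in the original image


def find_features_parallel(
    regions: Dict[str, Region],
    finder: Callable[[str], Tuple[float, float]],
    max_workers: int = 4,
) -> Dict[str, Tuple[float, float]]:
    """Run one worker per region and map the results back into the original image.

    finder(name) is assumed to return the feature position in the region's
    own (magnified) coordinates; the master divides by the region scale and
    adds the region's offset.
    """
    def work(item):
        name, (left, top, scale) = item
        local_x, local_y = finder(name)
        return name, (left + local_x / scale, top + local_y / scale)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(work, regions.items()))
```

Swapping in a process pool would follow the same pattern, provided the worker function and its arguments can be passed between processes.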
[0051] As shown in FIG. 13, the facial features
corresponding to the nodes may be classified to account for
inappropriate tracking error indications such as, for
example, blinking or mouth opening. Labels are attached
to the different jets in the bunch graph corresponding
to the facial features, e.g., eye open/closed, mouth
open/closed, etc. The labels may be copied along with the
corresponding jet in the bunch graph which is most
similar to the current image. The label tracking
may be continuously monitored, regardless of
whether a tracking error is detected. Accordingly,
classification nodes may be attached to the tracked nodes for
the following:

- eye open/closed
- mouth open/closed
- tongue visible or not
- hair type classification
- wrinkle detection (e.g., on the forehead)
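Copying the label of the most similar stored jet can be sketched as below. The sketch assumes jets are lists of magnitudes compared with a normalized dot product, which stands in for the similarity functions defined elsewhere in the specification; `classify_feature` and the example label strings are illustrative.

```python
import math
from typing import Optional, Sequence, Tuple

Jet = Sequence[float]


def jet_similarity(j1: Jet, j2: Jet) -> float:
    """Normalized dot product of two jets (an assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(j1, j2))
    n1 = math.sqrt(sum(a * a for a in j1))
    n2 = math.sqrt(sum(b * b for b in j2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)


def classify_feature(
    jet: Jet, labeled_bunch: Sequence[Tuple[str, Jet]]
) -> Tuple[Optional[str], float]:
    """Return the label of the stored jet most similar to the current jet."""
    best_label, best_sim = None, -1.0
    for label, example in labeled_bunch:
        sim = jet_similarity(jet, example)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


eye_bunch = [("eye open", [1.0, 0.2, 0.0]), ("eye closed", [0.1, 1.0, 0.3])]
label, similarity = classify_feature([0.2, 0.9, 0.4], eye_bunch)
```

A label obtained this way can explain a drop in tracking confidence, for example when the eye node's jet changes because the eyelid has closed rather than because tracking has failed.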

[0052] Thus, tracking allows utilization of two informa- 
tion sources. One information source is based on the 
feature locations, i.e. the node positions, and the other 
information source is based on the feature classes. The 
feature class information is more texture based and, by
comparing the local image region with a set of stored
examples, may function using lower resolution and
tracking accuracy than feature class information that is
based solely on the node positions.
[0053] The facial sensing of the invention may be
applied to the creation and animation of static and dynamic
avatars as shown in FIG. 14. The avatar may be based
on a generic facial model or based on a person specific
facial model. The tracking and facial expression
recognition may be used for the incarnation of the avatar
with the person's features.

[0054] The generic facial model may be adapted to a
representative number of individuals and may be adapted
to perform realistic animation and rendering of a wide
range of facial features and/or expressions. The generic
model may be obtained by the following techniques.

1. Mono-camera systems may be used (T. Akimoto
et al. 1993) to produce a realistic avatar for use in
low-end tele-immersion systems. Face profile
information of individuals, as seen from sagittal and
coronal planes, may be merged to obtain the avatar.

2. Stereo-camera systems are able to perform ac- 
curate 3-D measurements when the cameras are 
fully calibrated (camera parameters are computed 
through a calibration process). An individual facial 
model may then be obtained by fitting a generic fa- 
cial model to the obtained 3-D data. Because stereo 
algorithms do not provide accurate information on 
non-textured areas, projection of active-textured 
light may be used. 

3. Feature-based stereo techniques where markers 
are used on the individual face to compute precise 
3-D positions of the markers. 3-D information is then 
used to fit a generic model. 

4. Three-dimensional digitizers in which a sensor or
locating device is moved over each surface point to
be measured.

5. Active structured light where patterns are project- 
ed and the resulting video stream is processed to 
extract 3D measurements. 

6. Laser-based surface scanning devices (such as 
those developed by Cyberware, Inc) that provide 
accurate face measurements. 

7. A combination of the previous techniques. These
differing techniques are not of equal convenience
to the user. Some are able to obtain measurements
on the individual in a one-time process (the face
being in a desired posture for the duration of the
measurement), while others need a collection of samples
and are more cumbersome to use.

[0055] A generic three-dimensional head model for a 
specific person can be generated using two facial images
showing a frontal and a profile view. Facial sensing
enables efficient and robust generation of the 3-D
head model.

[0056] Facial contour extraction is performed together
with the localization of the person's eyes, nose, mouth
and chin. This feature location information may be
obtained by using the elastic bunch graph technique
in combination with hierarchical matching to
automatically extract facial features as shown in FIG. 14.
The feature location information may then be combined
(see T. Akimoto and Y. Suenaga. Automatic Creation of
3D Facial Models. In IEEE Computer Graphics &
Applications, pages 16-22. September 1993.) to produce a
3D model of the person's head. A generic three
dimensional head model is adapted so that its proportions are
related to the image's measurements. Finally, both side
and front images may be combined to obtain a better
texture model for the avatar, i.e., the front view is used
to texture map the front of the model and the side view
is used for the side of the model. Facial sensing
improves this technique because extracted features may
be labeled (known points may be defined in the profile)
so that the two images need not be taken simultaneously.

[0057] An avatar image may be animated by the
following common techniques (see F. I. Parke and K. Waters.
Computer Facial Animation. A K Peters, Ltd. Wellesley,
Massachusetts, 1996).

1. Key framing and geometric interpolation, where
a number of key poses and expressions are defined.
Geometric interpolation is then used between
the keyframes to provide animation. Such a system
is frequently referred to as a performance-based (or
performance-driven) model.

2. Direct parameterization which directly maps
expressions and pose to a set of parameters that are
then used to drive the model.

3. Pseudo-muscle models which simulate muscle
actions using geometric deformations.

4. Muscle-based models where muscles and skin
are modeled using physical models.

5. 2-D and 3-D morphing which use 2D morphing
between images in a video stream to produce 2D
animation. A set of landmarks is identified and
used to warp between two images of a sequence.
Such a technique can be extended to 3D (see F.
F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, and
D. H. Salesin. Synthesizing Realistic Facial Expressions
from Photographs. In SIGGRAPH 98 Conference
Proceedings, pages 75-84. July 1998).

6. Other approaches such as control points and
finite element models.

[0058] For these techniques, facial sensing enhances
the animation process by providing automatic extraction
and characterization of facial features. Extracted
features may be used to interpolate expressions in the case
of key framing and interpolation models, or to select
parameters for direct parameterized models or
pseudo-muscle or muscle models. In the case of 2-D and 3-D
morphing, facial sensing may be used to automatically
select features on a face providing the appropriate
information to perform the geometric transformation.
[0059] An example of an avatar animation that uses
facial feature tracking and classification may be shown
with respect to FIG. 15. During the training phase the
individual is prompted for a series of predetermined
facial expressions (block 120), and sensing is used to
track the features (block 122). At predetermined
locations, jets and image patches are extracted for the
various expressions (block 124). Image patches surrounding
facial features are collected along with the jets 126
extracted from these features. These jets are used later
to classify or tag facial features 128. This is done by
using these jets to generate a personalized bunch graph
and by applying the classification method described
above.

[0060] As shown in FIG. 16, for animation of an avatar,
the system transmits all image patches 128, as well as
the image of the whole face 130 (the "face frame") minus
the parts shown in the image patches to a remote site
(blocks 132 & 134). The software for the animation
engine also may need to be transmitted. The sensing
system then observes the user's face and facial sensing is
applied to determine which of the image patches is most
similar to the current facial expression (blocks 136 &
138). The image tags are transmitted to the remote site
(block 140) allowing the animation engine to assemble
the face 142 using the correct image patches.
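The decoding side of this scheme can be sketched as follows. The data layout is an assumption made for illustration: images are lists of pixel rows, patches are stored under `(feature, tag)` keys, and `origins` gives the top-left corner of each feature in the face frame; none of these names come from the specification.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def assemble_face(
    face_frame: Image,
    patches: Dict[Tuple[str, str], Image],
    origins: Dict[str, Tuple[int, int]],
    tags: Dict[str, str],
) -> Image:
    """Paste the patch selected by each received tag into a copy of the face frame."""
    face = [row[:] for row in face_frame]  # leave the stored face frame untouched
    for feature, tag in tags.items():
        patch = patches[(feature, tag)]
        top, left = origins[feature]
        for r, row in enumerate(patch):
            face[top + r][left:left + len(row)] = row
    return face
```

Once the patches and the face frame have been sent, only the short tags need to be transmitted per frame, which is what keeps the bandwidth requirement low.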
[0061] To fit the image patches smoothly into the
image frame, Gaussian blurring may be employed. For
realistic rendering, local image morphing may be needed
because the animation may not be continuous in the
sense that a succession of images may be presented
as imposed by the sensing. The morphing may be
realized using linear interpolation of corresponding points
on the image space. To create intermediate images,
linear interpolation is applied using the following
equations:

$$P_i = (2 - i)P_1 + (i - 1)P_2 \qquad (7)$$

$$I_i = (2 - i)I_1 + (i - 1)I_2 \qquad (8)$$

where $P_1$ and $P_2$ are corresponding points in the images
$I_1$ and $I_2$, and $I_i$ is the $i$-th interpolated image, with
$1 < i < 2$. Note that, for processing efficiency, the image
interpolation may be implemented using a pre-computed hash
table for $P_i$ and $I_i$. The number and accuracy of the points
used for the interpolated facial model generally determine
the resulting image quality.

[0062] Thus, the reconstructed face in the remote
display may be composed by assembling pieces of images
corresponding to the detected expressions in the
learning step. Accordingly, the avatar exhibits features
corresponding to the person commanding the animation.
Thus, at initialization, a set of cropped images
corresponding to each tracked facial feature and a "face
container" as the resulting image of the face after each
feature is removed. The animation is started and facial
sensing is used to generate specific tags which are
transmitted as described previously. Decoding occurs
by selecting image pieces associated with the transmitted
tag, e.g., the image of the mouth labeled with a tag
"smiling-mouth" 146 (FIG. 16).
[0063] A more advanced level of avatar animation
may be reached when the aforementioned dynamic
texture generation is integrated with more conventional
techniques of volume morphing as shown in FIG. 17.
To achieve volume morphing, the location of the node
positions may be used to drive control points on a mesh
150. Next, the textures 152 dynamically generated using
tags are mapped onto the mesh to generate a
realistic head image 154. An alternative to using the
sensed node positions as drivers of control points on the
mesh is to use the tags to select local morph targets. A
morph target is a local mesh configuration that has been
determined for the different facial expressions and
gestures for which sample jets have been collected. These
local mesh geometries can be determined by stereo
vision techniques. The use of morph targets is further
developed in the following references (see J.
R. Kent, W. E. Carlson, and R. E. Parent. Shape
Transformation for Polyhedral Objects. In SIGGRAPH 92
Conference Proceedings, volume 26, pages 47-54.
August 1992; Pighin et al. 1998, supra).
[0064] A useful extension to the vision-based avatar
animation is to integrate the facial sensing with speech
analysis in order to synthesize the correct lip motion as
shown in FIG. 18. The lip synching technique is
particularly useful to map lip motions resulting from speech
onto an avatar. It is also helpful as a back-up in case the
vision-based lip tracking fails.

[0065] Although the foregoing discloses the preferred
embodiments of the present invention, it is understood
that those skilled in the art may make various changes
to the preferred embodiments without departing from
the scope of the invention. The invention is defined only
by the following claims.



Claims 

1. A method for feature sensing on a sequence of im- 
age frames (40), comprising a step (42) for trans- 
forming each image frame (40) using a wavelet 
transformation to generate a transformed image 
frame (44), a step (46) for initializing nodes (64) of 
a model graph (68), each node (64) associated with 
a wavelet jet (60) specific to a feature, to locations 
on the transformed image frame (44) by moving the 
model graph (68) across the transformed image 
frame (44) and placing the model graph (68) at a 
location in the transformed image frame (44) of 
maximum jet similarity (50) between the wavelet 
jets (60) of the nodes (64) and locations on the
transformed image frame (44) determined as the 
model graph (68) is moved across the transformed 
image frame (44), and a step (34) for tracking the 
location of one or more nodes (64) of the model 
graph (68) between image frames, characterized 
in that the method further comprises a step (90) for 
reinitializing the location of a tracked node (64) if 
the tracked node's location deviates, between im- 
age frames (40), beyond a predetermined position 
constraint, such that only the location of a tracked 
node (64) that has deviated beyond the predeter- 
mined position constraint is reinitialized and track- 
ing the location of one or more other nodes (64) of 
the model graph (68) that have not deviated beyond 
predetermined position constraint, between image 
frames (40), continues without reinitialization, and 
wherein the model graph (68) used in the initializing 
step (46) and in the reinitializing step (50) is based 
on a predetermined pose of the tracked object. 

2. A method for feature sensing as defined in claim 1,
characterized in that the tracking step (34) tracks
the node (64) locations using elastic bunch graph
matching.

3. A method for feature sensing as defined in claim 1,
characterized in that the tracking step (34) uses
linear position prediction to predict node (64)
locations in a subsequent image frame and the
reinitializing step reinitializes a node (64) location based
on a deviation from the predicted node (64) location
that is greater than a predetermined threshold
deviation.

4. A method for feature sensing as defined in claim 1,
characterized in that the predetermined position
constraint is based on a geometrical position
constraint associated with relative positions of the node
(64) locations.

5. A method for feature sensing as defined in claim 1,
characterized in that the node (64) locations are
transmitted to a remote site (16,18,20,132,134,140)
for animating an avatar image (142).

6. A method for facial feature sensing as defined in
claim 1, characterized in that the node locations
tracking step (34) includes lip synching based on
audio signals associated with movement of the
node locations specific to a mouth generating the
audio signals.

7. A method for facial feature sensing as defined in
claim 1, characterized in that the reinitializing step
(90) is performed using bunch graph matching
(104,108,112).

8. A method for facial feature sensing as defined in
claim 7, characterized in that the bunch graph
matching (104,108,112) is performed using a partial
bunch graph.

9. A method for feature sensing as defined in claim 1,
characterized in that the tracking step (34)
includes determining a facial characteristic.

10. A method for feature sensing as defined in claim 9,
characterized in that the method further comprises
transmitting the node locations and facial
characteristics to a remote site (132,134,140) for
animating an avatar image (142) having facial
characteristics which are based upon the facial
characteristics determined by the tracking step (34).

11. A method for feature sensing as defined in claim 9,
characterized in that the facial characteristic
determined by the tracking step (34) is whether a
mouth is open or closed.

12. A method for feature sensing as defined in claim 9,
characterized in that the facial characteristic
determined by the tracking step (34) is whether eyes
are open or closed.

13. A method for feature sensing as defined in claim 9,
characterized in that the facial characteristic
determined by the tracking step (34) is whether a
tongue is visible in the mouth.

14. A method for feature sensing as defined in claim 9,
characterized in that the facial characteristic
determined by the tracking step (34) is based on facial
wrinkles detected in the image.

15. A method for feature sensing as defined in claim 9,
characterized in that the facial characteristic
determined by the tracking step (34) is based on hair
type.

16. A method for feature sensing as defined in claim 9,
characterized in that each facial characteristic is
associated by training (120) with an image tag
(128,138) that identifies an image segment of the
image frame (130) that is associated with the facial
characteristic.

17. A method for feature sensing as defined in claim 16,
characterized in that the image segments
identified by the associated image tag (128,138) are
morphed into an avatar image (142).

18. A method for feature sensing as defined in claim 16,
characterized in that the node locations and
feature tags are used for volume morphing the
corresponding image segments into a three-dimensional
image (154).
19. A method for feature sensing as defined in claim 9,
wherein the model graph (68) comprises 18 nodes
(64) associated with distinguishing features on a
human face.

20. A method for feature sensing as defined in claim 19, 
wherein the 18 node (64) locations of the face are 
associated with, respectively, 

a right eye pupil; 
a left eye pupil;
a top of a nose; 

a right corner of a right eyebrow; 
a left corner of the right eyebrow; 
a left corner of a left eyebrow; 
a right nostril; 
a tip of the nose; 
a left nostril; 

a right corner of a mouth; 
a center of an upper lip; 
a left corner of the mouth; 
a center of a lower lip; 
a bottom of a right ear; 
a top of the right ear; 
a top of a left ear; and 
a bottom of the left ear. 

21. Apparatus for feature sensing on a sequence of im- 
age frames (40), comprising means (12, 22, 24, 26, 
28, 30) for transforming each image frame (40) us- 
ing a wavelet transformation (42) to generate a 
transformed image frame (44), means (46) for ini- 
tializing nodes (64) of a model graph (68), each 
node (64) associated with a wavelet jet (60) specific 
to a feature, to locations on the transformed image 
frame (44) by moving the model graph (68) across 
the transformed image frame (44) and placing the 
model graph (68) at a location in the transformed 
image frame (44) of maximum jet similarity (50) be- 
tween the wavelet jets (60) of the nodes (64) and 
locations on the transformed image frame (44) de- 
termined as the model graph (68) is moved across 
the transformed image frame (44), and means (34) 
for tracking the location of one or more nodes (64) 
of the model graph (68) between image frames, 
characterized in that the apparatus further com- 
prises means for reinitializing a tracked node (64) if 
the tracked node's location deviates, between im- 
age frames (40), beyond a predetermined position 
constraint, such that only the location of a tracked 
node (64) that has deviated beyond the predeter- 
mined position constraint is reinitialized and track- 
ing the location of one or more other nodes (64) of 
the model graph (68) that have not deviated beyond 
predetermined position constraint, between image 
frames (40), continues without reinitialization, and 
wherein the model graph (68) used by the initializing
means (46) and the reinitializing means (90) is
based on a predetermined pose of the tracked
object.

22. Apparatus for feature sensing as defined in claim
21, characterized in that the apparatus further
comprises means (14) for determining a facial
characteristic, and means (18) for animating an avatar
image having facial characteristics which are based
upon the facial characteristics generated by the
determining means (14).



Patentansprüche

15 1 . Verfahren zur Merkmalserkennung in einer Abfolge 
von Einzeibildern (40), mit einem Schritt (42) zum 
Transform ieren jedes Einzelbilds (40) durch eine 
Elementarwellen-Transformation, urn ein transfor- 
miertes Einzelbild (44) zu erzeugen, einem Schritt 

20 (46), um Knotenpunkte (64) eines Modell-Dia- 
gramms (68), von denen jeder Knotenpunkt (64) ei- 
nem fur ein Merkma! spezifischen Elementarwel- 
len-Strahl (60) zugeordnet ist, zu Positionen an 
dem transform ierten Einzelbild (44) zu initialisieren, 

25 indem das Modell-Diagramm (68) uber das eine 
transform ierte Einzelbild (44) bewegt wird und das 
Modell-Diagramm (68) an einer Position in dem 
transformierten Einzelbild (44) platziert wird, an der 
die maximale Strahl-Ahnlichkeit (50) zwischen den 

30 Elementarwellen-Strahlen (60) der Knotenpunkte 
(64) und Positionen an dem transformierten Einzel- 
bild (44) besteht, die bestimmt werden, wahrend 
das Modell-Diagramm (68) uber das transformierte 
Einzelbild (44) bewegt wird, und einem Schritt (34) 

35 zum Verfolgen der Position eines Oder mehrerer 
Knotenpunkte (64) des Modell-Diagramms (68) 
zwischen Einzeibildern, 

dadurch gekennzeichnet, dass das Verfahren fer- 
nereinen Schritt (90) zum Reinitialisieren der Posi- 

40 tion eines verfolgten Knotenpunkts (64), falls die 
Position des verfolgten Knotenpunkts (64) zwi- 
schen Einzeibildern (40) uber eine vorbestimmte 
Positionsbeschrankung hinaus abweicht, aufweist, 
derart, dass nur die Position eines verfolgten Kno- 

45 tenpunkts (64), die uber die vorbestimmte Positi- 
onsbeschrankung hinaus abgewichen ist, reinitiali- 
siert wird und die Position eines oder mehrerer wei- 
terer Knotenpunkte (64) des Modell-Diagramms 
(68), die nicht uber die vorbestimmte Positionsbe- 

50 schrankung hinaus zwischen Einzeibildern (40) ab- 
gewichen sind, ohne Reinitialisierung fortbesteht, 
und wobei das in dem Initialisierungs-Schritt (46) 
und dem Reinitialisierungs-Schritt (50) verwendete 
Modell-Diagramm (68) auf einer vorbestimmten 

55 Orientierung des verfolgten Objekts basiert. 

2. Verfahren zur Merkmalserkennung nach Anspruch
1, dadurch gekennzeichnet, dass in dem
Verfolgungs-Schritt (34) die Knotenpunkt(64)-Positionen
durch elastisches Bunch Graph Matching verfolgt
werden.

3. Verfahren zur Merkmalserkennung nach Anspruch 
1 , dadurch gekennzeichnet, dass in dem Verfol- 
gungs-schritt (34) eine Linearpositions-Vorhersage 
zum Vorhersagen von Knotenpunkt(64)-Positionen 
in einem nachfolgenden Einzelbild verwendet wird 
und in dem Reinitialisierungs-Schritt eine Knoten- 
punkt(64)- Position auf der Basis einer Abweichung 
von der vorhergesagten Knotenpunkt(64)- Position 
reinitialisiert wird, die groBer ist als eine vorbe- 
stimmte Schwell-Abweichung. 

4. Verfahren zur Merkmalserkennung nach Anspruch 
1, dadurch gekennzeichnet, dass die vorbe- 
stimmte Positionsbeschrankung auf einer geome- 
trischen Positionsbeschrankung basiert, die Rela- 
tivpositionen der Knotenpunkt(64)-Positionen zu- 
geordnet ist. 

5. Verfahren zur Merkmalserkennung nach Anspruch 
1, dadurch gekennzeichnet, dass die Knoten- 
punkt(64)-Positionen an eine entfernte Position 
(16,18,20,132,134,140) zwecks Animierung eines 
Bilds einer virtuellen Figur (142) ubertragen wer- 
den. 

6. Verfahren zur Merkmalserkennung nach Anspruch 
1 , dadurch gekennzeichnet, dass in dem Knoten- 
punktpositionen-Verfolgungs-Schritt (34) eine Lip- 
pen-Synchronisierung basierend auf Audiosigna- 
len durchgefuhrt wird, die derjenigen Bewegung der 
Knotenpunkt-Positionen zugeordnet sind, welche 
fur einen die Audio-Signale erzeugenden Mund 
spezifisch sind. 

7. Verfahren zur Merkmalserkennung nach Anspruch 
1, dadurch gekennzeichnet, dass der Reinitiali- 
sierungs-Schritt (90) durch Gruppen-Diagramrn- 
Anpassung (104,108,112) durchgefuhrt wird. 

8. Verfahren zur Merkmalserkennung nach Anspruch 
7, dadurch gekennzeichnet, dass die Gruppen- 
Diagramm-Anpassung (104,108,112) durch ein 
Teilgruppen-Diagramm durchgefuhrt wird. 

9. Verfahren zur Merkmalserkennung nach Anspruch 
1, dadurch gekennzeichnet, dass in dem Verfol- 
gungs-Schritt (34) ein Gesichts-Merkmal bestimmt 
wird. 



10. A method for feature sensing as defined in claim 9, characterized in that the method further comprises transmitting the node locations and facial characteristics to a remote site (132, 134, 140) for animating an avatar image (142) having facial characteristics which are based upon the facial characteristics determined by the tracking step (34).

11. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether a mouth is open or closed.

12. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether eyes are open or closed.

13. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether a tongue is visible in the mouth.

14. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is based on facial wrinkles detected in the image.

15. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is based on hair type.

16. A method for feature sensing as defined in claim 9, characterized in that each facial characteristic is associated by training (120) with an image tag (128, 138) that identifies an image segment of the image frame (130) that is associated with the facial characteristic.

17. A method for feature sensing as defined in claim 16, characterized in that the image segments identified by the associated image tag (128, 138) are morphed into an avatar image (142).

18. A method for feature sensing as defined in claim 16, characterized in that the node locations and feature tags are used for volume morphing the corresponding image segments into a three-dimensional image.



19. A method for feature sensing as defined in claim 9, characterized in that the model graph (68) comprises 18 nodes (64) associated with distinguishing features of a human face.

20. A method for feature sensing as defined in claim 19, characterized in that the 18 node (64) locations of the face are associated, respectively, with:

a right eye pupil;
a left eye pupil;
a top of a nose;
a right corner of a right eyebrow;
a left corner of the right eyebrow;
a left corner of a left eyebrow;
a right nostril;
a tip of the nose;
a left nostril;
a right corner of a mouth;
a center of an upper lip;
a left corner of the mouth;
a center of a lower lip;
a bottom of a right ear;
a top of the right ear;
a bottom of a left ear; and
a top of the left ear.

21. Apparatus for feature sensing on a sequence of image frames (40), comprising means (12, 22, 24, 26, 28, 30) for transforming each image frame (40) using a wavelet transformation to generate a transformed image frame (44), means (46) for initializing nodes (64) of a model graph (68), each node (64) associated with a wavelet jet (60) specific to a feature, to locations on the transformed image frame (44) by moving the model graph (68) across the transformed image frame (44) and placing the model graph (68) at a location in the transformed image frame (44) of maximum jet similarity (50) between the wavelet jets (60) of the nodes (64) and locations on the transformed image frame (44) determined as the model graph (68) is moved across the transformed image frame (44), and means (34) for tracking the location of one or more nodes (64) of the model graph (68) between image frames,

characterized in that the apparatus further comprises means for reinitializing a tracked node (64) if the tracked node's location deviates beyond a predetermined position constraint between image frames (40), such that only the location of a tracked node (64) that has deviated beyond the predetermined position constraint is reinitialized and the location of one or more other nodes (64) of the model graph (68) that have not deviated beyond the predetermined position constraint between image frames (40) continues without reinitialization, and wherein the model graph (68) used by the initializing means (46) and the reinitializing means (90) is based on a predetermined pose of the tracked object.

22. Apparatus for feature sensing as defined in claim 21, characterized in that the apparatus further comprises means (14) for determining a facial characteristic and means (18) for animating an avatar image having facial characteristics which are based upon the facial characteristics generated by the determining means (14).



Claims

1. A method for feature sensing on a sequence of image frames (40), comprising a step (42) for transforming each image frame (40) using a wavelet transformation to generate a transformed image frame (44), a step (46) for initializing nodes (64) of a model graph (68), each node (64) associated with a wavelet jet (60) specific to a feature, to locations on the transformed image frame (44) by moving the model graph (68) across the transformed image frame (44) and placing the model graph (68) at a location in the transformed image frame (44) of maximum jet similarity (50) between the wavelet jets (60) of the nodes (64) and locations on the transformed image frame (44) determined as the model graph (68) is moved across the transformed image frame (44), and a step (34) for tracking the location of one or more nodes (64) of the model graph (68) between image frames,
characterized in that the method further comprises a step (90) for reinitializing the location of a tracked node (64) if the tracked node's location deviates beyond a predetermined position constraint between image frames (40), such that only the location of a tracked node (64) that has deviated beyond the predetermined position constraint is reinitialized, and the tracking of the location of one or more nodes (64) of the model graph (68) that have not deviated beyond the predetermined position constraint between image frames (40) continues without reinitialization, and wherein the model graph (68) used in the initializing step (46) and in the reinitializing step (90) is based on a predetermined pose of the tracked object.
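The initializing step of claim 1, in which the model graph is moved across the transformed frame and placed where the jet similarity is greatest, can be sketched as follows. This is a simplified illustration, not the claimed implementation: jets are reduced to short lists of magnitudes (real wavelet jets hold complex Gabor coefficients), similarity is taken to be a normalized dot product, node offsets are assumed to be non-negative, and the names `jet_similarity` and `place_graph` are hypothetical.

```python
import math

def jet_similarity(j1, j2):
    """Normalized dot product of two jets given as lists of magnitudes."""
    dot = sum(a * b for a, b in zip(j1, j2))
    norm = math.sqrt(sum(a * a for a in j1) * sum(b * b for b in j2))
    return dot / norm if norm else 0.0

def place_graph(jet_field, node_offsets, node_jets):
    """Slide a rigid graph over the jet field and return the origin
    (row, col) with the highest mean jet similarity, plus that score.

    jet_field[y][x] is the jet at pixel (y, x); node_offsets are
    non-negative (dy, dx) offsets of each node from the graph origin.
    """
    height, width = len(jet_field), len(jet_field[0])
    max_dy = max(dy for dy, dx in node_offsets)
    max_dx = max(dx for dy, dx in node_offsets)
    best_origin, best_score = None, -1.0
    for y in range(height - max_dy):
        for x in range(width - max_dx):
            score = sum(
                jet_similarity(jet_field[y + dy][x + dx], jet)
                for (dy, dx), jet in zip(node_offsets, node_jets)
            ) / len(node_jets)
            if score > best_score:
                best_origin, best_score = (y, x), score
    return best_origin, best_score

# Toy 4x5 field: uniform background with two distinctive jets.
field = [[[1.0, 0.0] for _ in range(5)] for _ in range(4)]
field[1][2] = [0.0, 1.0]
field[2][3] = [1.0, 1.0]
origin, score = place_graph(field, [(0, 0), (1, 1)], [[0.0, 1.0], [1.0, 1.0]])
print(origin, score)
```

An exhaustive scan is used here for clarity; a practical system would likely restrict or coarsen the search.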

2. A method for feature sensing as defined in claim 1, characterized in that the tracking step (34) tracks the node (64) locations using elastic bunch graph matching.

3. A method for feature sensing as defined in claim 1, characterized in that the tracking step (34) uses linear position prediction to predict node (64) locations in a subsequent image frame, and the reinitializing step reinitializes a node (64) location based on a deviation from the predicted node (64) location that is greater than a predetermined threshold deviation.
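The linear position prediction and threshold test of claim 3 can be illustrated with a short sketch. It assumes a constant-velocity extrapolation from the two preceding frames and a Euclidean pixel threshold; both choices, and the function names, are assumptions made for illustration rather than details given in the claim.

```python
import math

def predict_linear(prev, curr):
    """Extrapolate the next (x, y) position assuming constant velocity."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])

def nodes_to_reinitialize(prev, curr, tracked, threshold):
    """Return the names of nodes whose tracked position in the next frame
    deviates from the linear prediction by more than `threshold` pixels.

    prev, curr and tracked map node names to (x, y) positions in three
    consecutive frames. Nodes not returned simply keep being tracked.
    """
    flagged = []
    for name in tracked:
        px, py = predict_linear(prev[name], curr[name])
        deviation = math.hypot(tracked[name][0] - px, tracked[name][1] - py)
        if deviation > threshold:
            flagged.append(name)
    return flagged

prev = {"a": (0, 0), "b": (10, 10)}
curr = {"a": (1, 0), "b": (10, 11)}
tracked = {"a": (2, 0), "b": (30, 30)}   # node "b" has jumped
print(nodes_to_reinitialize(prev, curr, tracked, threshold=3.0))
```

Only the flagged node would be reinitialized, which matches the selective reinitialization recited in claim 1.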



4. A method for feature sensing as defined in claim 1, characterized in that the predetermined position constraint is based on a geometrical position constraint associated with relative positions between the node (64) locations.

5. A method for feature sensing as defined in claim 1, characterized in that the node (64) locations are transmitted to a remote site (16, 18, 20, 132, 134, 140) for animating an avatar image (142).

6. A method for facial feature sensing as defined in claim 1, characterized in that the node location tracking step (34) includes lip synching based on audio signals associated with movement of the node locations specific to a mouth generating the audio signals.

7. A method for facial feature sensing as defined in claim 1, characterized in that the reinitializing step (90) is performed using bunch graph matching (104, 108, 112).

8. A method for facial feature sensing as defined in claim 7, characterized in that the bunch graph matching (104, 108, 112) is performed using a partial bunch graph.

9. A method for feature sensing as defined in claim 1, characterized in that the tracking step (34) includes determining a facial characteristic.

10. A method for feature sensing as defined in claim 9, characterized in that the method further comprises transmitting the node locations and facial characteristics to a remote site (132, 134, 140) for animating an avatar image (142) having facial characteristics which are based upon the facial characteristics determined by the tracking step (34).

11. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether a mouth is open or closed.

12. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether eyes are open or closed.



13. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is whether a tongue is visible in the mouth.

14. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is based on facial wrinkles detected in the image.

15. A method for feature sensing as defined in claim 9, characterized in that the facial characteristic determined by the tracking step (34) is based on hair type.

16. A method for feature sensing as defined in claim 9, characterized in that each facial characteristic is associated by training (120) with an image tag (128, 138) that identifies an image segment of the image frame (130) that is associated with the facial characteristic.

17. A method for feature sensing as defined in claim 16, characterized in that the image segments identified by the associated image tag (128, 138) are morphed into an avatar image (142).

18. A method for feature sensing as defined in claim 16, characterized in that the node locations and feature tags are used for volume morphing the corresponding image segments into a three-dimensional image (154).

19. A method for feature sensing as defined in claim 16, wherein the model graph (68) comprises 18 nodes (64) associated with distinguishing features of a human face.



20. A method for feature sensing as defined in claim 19, wherein the 18 node (64) locations of the face are associated, respectively, with:

a right eye pupil;
a left eye pupil;
a top of a nose;
a right corner of a right eyebrow;
a left corner of the right eyebrow;
a left corner of a left eyebrow;
a right nostril;
a tip of the nose;
a left nostril;
a right corner of a mouth;
a center of an upper lip;
a left corner of the mouth;
a center of a lower lip;
a bottom of a right ear;
a top of the right ear;
a top of a left ear; and
a bottom of the left ear.

21. Apparatus for feature sensing on a sequence of image frames (40), comprising means (12, 22, 24, 26, 28, 30) for transforming each image frame (40) using a wavelet transformation (42) to generate a transformed image frame (44), means (46) for initializing nodes (64) of a model graph (68), each node (64) associated with a wavelet jet (60) specific to a feature, to locations on the transformed image frame (44) by moving the model graph (68) across the transformed image frame (44) and placing the model graph (68) at a location in the transformed image frame (44) of maximum jet similarity (50) between the wavelet jets (60) of the nodes (64) and locations on the transformed image frame (44) determined as the model graph (68) is moved across the transformed image frame (44), and means (34) for tracking the location of one or more nodes (64) of the model graph (68) between image frames,
characterized in that the apparatus further comprises means for reinitializing a tracked node (64) if the tracked node's location deviates beyond a predetermined position constraint between image frames (40), such that only the location of a tracked node (64) that has deviated beyond the predetermined position constraint is reinitialized, and the tracking of the location of one or more nodes (64) of the model graph (68) that have not deviated beyond the predetermined position constraint between image frames (40) continues without reinitialization, and wherein the model graph (68) used by the initializing means (46) and the reinitializing means (90) is based on a predetermined pose of the tracked object.

22. Apparatus for feature sensing as defined in claim 21, characterized in that the apparatus further comprises means (14) for determining a facial characteristic and means (18) for animating an avatar image having facial characteristics which are based upon the facial characteristics generated by the determining means (14).

[Figure labels: convolution result; Gabor wavelets; imaginary part; magnitude]

FIG. 7

FIG. 8

[Figure labels: initial graph placement; final graph placement]

FIG. 12

FIG. 13

FIG. 14